{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# `Parameter Configurations`: parameter summary settings"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Global parameters"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To help users quickly manage the parameters, currently, we defined four global parameters. Global parameter applies to all the plots which has that parameter.\n",
    "\n",
    "| Global Parameter | Description |\n",
    "| --- | --- | \n",
    "| `width` | Change the plots' width in `plot(df, col1)`, `plot(df, col1, col2)`, `plot(df, col1, col2, col3)`, `plot_correlation()` and `plot_missing()`.\n",
    "| `height` | Change the plots' height in `plot(df, col1)`, `plot(df, col1, col2)` and `plot(df, col1, col2, col3)`, `plot_correlation()` and `plot_missing()`.\n",
    "| `bins` | Apply to `bins` for `Histogram`, `KDE Plot`, `Box Plot`, `Word Length`, `Line Chart`, `Spectrum`.\n",
    "| `ngroups` | Apply to `bars` and `slices` for the `Bar Chart` and `Pie Chart`."
   ]
  },
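  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a rough sketch of how the global parameters might be set, the fragment below builds a config dictionary that uses only the four global names from the table above. The values are arbitrary, and the call shown in the final comment (with `df` and `\"col1\"` as placeholders) is an assumption about how the dictionary is passed rather than something this page specifies.\n",
    "\n",
    "```python\n",
    "# Illustrative values only; the keys are the four global parameters listed above.\n",
    "config = {\n",
    "    \"width\": 500,   # width of every plot that has a `width` parameter\n",
    "    \"height\": 400,  # height of every plot that has a `height` parameter\n",
    "    \"bins\": 30,     # bins for Histogram, KDE Plot, Box Plot, Word Length, Line Chart, Spectrum\n",
    "    \"ngroups\": 8,   # bars in the Bar Chart and slices in the Pie Chart\n",
    "}\n",
    "\n",
    "# Assumed usage, with `df` standing in for your own DataFrame:\n",
    "# plot(df, \"col1\", config=config)\n",
    "```"
   ]
  },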
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Local parameters"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Local parameters are plot-specified and the names are separated  by `.`. The portion before the first `.` is plot name and the portion after the first `.` is parameter name. The `.` is also used when the parameter name contains more than one word. When global parameter and local parameter are both entered by a user in config, the global parameter will be overwrote by local parameters for specific plots\n",
    "\n",
    "In the following tables we summarize the parameters for each API. You can also find the parameters for each plot in the Config API reference.\n",
    "\n"
   ]
  },
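  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The naming and precedence rules for local parameters can be illustrated with a small config fragment. The parameter names come from the tables on this page; the values are arbitrary and chosen only to make the override visible.\n",
    "\n",
    "```python\n",
    "# Illustrative values only.\n",
    "config = {\n",
    "    \"bins\": 20,                      # global: every plot that has a `bins` parameter\n",
    "    \"hist.bins\": 50,                 # local: plot name `hist`, parameter name `bins`\n",
    "    \"insight.missing.threshold\": 5,  # local: plot name `insight`, parameter name `missing.threshold`\n",
    "}\n",
    "\n",
    "# Under the rule that local parameters override global ones, the Histogram\n",
    "# would use 50 bins, while other plots with `bins` (such as the KDE Plot) would use 20.\n",
    "```"
   ]
  },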
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### `plot()`\n",
    "\n",
    "| Local Parameter | Type |Default | Description |\n",
    "| --- | --- | --- | --- |\n",
    "| `hist.enable` | bool | True | Whether enable `Histogram` |\n",
    "| `hist.height` | int | None | The height of the `Histogram`|\n",
    "| `hist.width` | int | None | The width of the `Histogram` |\n",
    "| `hist.bins` | int | 50 | Maximum number of bins to display in the `Histogram` |\n",
    "| `hist.yscale`| str | \"linear\" |  Y-axis scale (\"linear\" or \"log\") for the `Histogram` |\n",
    "| `bar.enable` | bool | True | Whether enable `Bar Chart` |\n",
    "| `bar.height` | int | None | The height of the `Bar Chart`|\n",
    "| `bar.width` | int | None | The width of the `Bar Chart`|\n",
    "| `bar.bars` | int | 10 | Maximum number of bars to display in the `Bar Chart` |\n",
    "| `bar.sort_descending` | bool | True|  Whether to sort the bars in descending order in the `Bar Chart`|\n",
    "| `bar.yscale` | str | \"linear\" |  Y-axis scale (\"linear\" or \"log\") for the `Bar Chart` |\n",
    "| `bar.color` | str | \"#1f77b4\" | Color of the bars in the `Bar Chart` |\n",
    "| `insight.enable` | bool | True | Whether enable `Insights` |\n",
    "| `insight.duplicates.threshold` | int | 1 | Warn if the percent of duplicated values is above this threshold in the `Insights`|\n",
    "| `insight.similar_distribution.threshold` | float | 0.05 | The significance level for Kolmogorov–Smirnov test in the `Insights` |\n",
    "| `insight.uniform.threshold` | float | 0.999 | The p-value threshold for chi-square test in the `Insights` |\n",
    "| `insight.missing.threshold` | int | 1 | Warn if the percent of missing values is above this threshold in the `Insights` |\n",
    "| `insight.skewed.threshold` | float | 1e-5 | The p-value for the scipy.skewtest which test whether the skew is different from the normal distributionin in the `Insights` |\n",
    "| `insight.infinity.threshold` | int | 1 | Warn if the percent of infinites is above this threshold in the `Insights` |\n",
    "| `insight.zeros.threshold` | int | 5 |  Warn if the percent of zeros is above this threshold in the `Insights` |\n",
    "| `insight.negatives.threshold` | int | 1 |  Warn if the percent of megatives is above this threshold in the `Insights` |\n",
    "| `insight.normal.threshold` | float | 0.99 | The p-value threshold for normal test, it is based on D’Agostino and Pearson’s test that combines skew and kurtosis to produce an omnibus test of normality in the `Insights` |\n",
    "| `insight.high_cardinality.threshold` | int | 50 | The threshold for unique values count, count larger than threshold yields high cardinality in the `Insights` |\n",
    "| `insight.constant.threshold` | int | 1 | The threshold for unique values count, count equals to threshold yields constant value in the `Insights` |\n",
    "| `insight.outstanding_no1.threshold` | float | 1.5 |The threshold for outstanding no1 insight, measures the ratio of the largest category count to the second-largest category count in the `Insights` |\n",
    "| `insight.attribution.threshold` | float | 0.5 | The threshold for the attribution insight, measures the percentage of the top 2 categories in the `Insights` |\n",
    "| `insight.high_word_cardinality.threshold` | int | 1000 | The threshold for the high word cardinality insight, which measures the number of words of that cateogory in the `Insights` |\n",
    "| `insight.outstanding_no1_word.threshold` | int | 0 | The threshold for the outstanding no1 word threshold, which measures the ratio of the most frequent word count to the second most frequent word count in the `Insights` |\n",
    "| `insight.outlier.threshold` | int | 0 | The threshold for the outlier count for the `Insights` in the `Box Plot`|\n",
    "| `kde.enable` | bool | True | Whether enable `KDE Plot` |\n",
    "| `kde.height` | int | None | The height of the `KDE Plot`|\n",
    "| `kde.width` | int | None | The width of the `KDE Plot` |\n",
    "| `kde.bins` | int | 50 | Maximum number of bins to display in the `KDE Plot` |\n",
    "| `kde.yscale`| str | \"linear\" | Y-axis scale (\"linear\" or \"log\") for the `KDE Plot` |\n",
    "| `kde.hist_color` | str | \"#aec7e8\" | Color of the histogram in the `KDE Plot` |\n",
    "| `kde.line_color` | str | \"#d62728\" | Color of the line in the `KDE Plot` |\n",
    "| `box.enable` | bool | True | Whether enable `Box Plot` |\n",
    "| `box.height` | int | None | The height of the `Box Plot`|\n",
    "| `box.width` | int | None | The width of the `Box Plot` |\n",
    "| `box.ngroups`| int | 15 | Maximum number of groups for categorical column to display in the `Box Plot` |\n",
    "| `box.bins`| int | 50 | Maximum number of bins for numerical column to display in the `Box Plot` |\n",
    "| `box.unit`| str | \"auto\" | Defines the time unit to group values over for a datetime column. It can be \"year\", \"quarter\", \"month\", \"week\", \"day\", \"hour\",\"minute\", \"second\". With default value \"auto\", it will use the time unit such that the resulting number of groups is closest to 15 in the `Box Plot` |\n",
    "| `box.sort_descending`| bool | True | Whether to sort the boxes in descending order of frequency in the `Box Plot` |\n",
    "| `box.color` | str | \"#d62728\" | Color of the `Box Plot` |\n",
    "| `value_table.enable` | bool | True | Whether enable `Value Table` |\n",
    "| `value_table.ngroups` | int | 10 | number of values to show in the `Value Table` |\n",
    "| `pie.enable` | bool | True | Whether enable `Pie Chart` |\n",
    "| `pie.height` | int | None | The height of the `Pie Chart`|\n",
    "| `pie.width` | int | None | The width of the `Pie Chart` |\n",
    "| `pie.slices`| int | 10 | Maximum number of pie slices to display in the `Pie Chart`|\n",
    "| `pie.sort_descending`| bool | True | Whether to sort the slices in descending order of frequency in the `Pie Chart`|\n",
    "| `pie.colors` | List[str] | None | Colors of the slices in the `Pie Chart` |\n",
    "| `wordcloud.enable` | bool | is_notebook | Whether enable `Word Cloud`. Default True if it's in notebook environment, else False |\n",
    "| `wordcloud.height` | int | None | The height of the `Word Cloud`|\n",
    "| `wordcloud.width` | int | None | The width of the `Word Cloud` |\n",
    "| `wordcloud.top_words`| int | 30 | Maximum number of most frequent words to display in the `Word Cloud`|\n",
    "| `wordcloud.stopword`| bool | True | Whether to remove stopwords in the `Word Cloud`|\n",
    "| `wordcloud.lemmatize`| bool | False |  Whether to lemmatize the words in the `Word Cloud`|\n",
    "| `wordcloud.stem`| bool | False |  Whether to apply Potter Stem on the words in the `Word Cloud`|\n",
    "| `wordfreq.top_words`| int | 30 | Maximum number of most frequent words to display in the `Word Frequency`|\n",
    "| `wordfreq.stopword`| bool | True | Whether to remove stopwords in the `Word Frequency`|\n",
    "| `wordfreq.lemmatize`| bool | False |  Whether to lemmatize the words in the `Word Frequency`|\n",
    "| `wordfreq.stem`| bool | False |  Whether to apply Potter Stem on the words in the `Word Frequency`|\n",
    "| `wordfreq.color` | str | \"#1f77b4\" | Color of the bars in the `Word Frequency Plot` |\n",
    "| `wordlen.bins`| int | 50 | Maximum number of bins in the `Word Length`|\n",
    "| `wordlen.yscale`| str | \"linear\" | Y-axis scale (\"linear\" or \"log\") for the `Word Length`|\n",
    "| `wordlen.color` | str | \"#aec7e8\" | Color of the bars in the `Word Length Plot` |\n",
    "| `line.enable` | bool | True | Whether enable `Line Chart` |\n",
    "| `line.height` | int | None | The height of the `Line Chart`|\n",
    "| `line.width` | int | None | The width of the `Line Chart` |\n",
    "| `line.bins`| int | 50 |  Maximum number of bins to display in the `Line Chart`|\n",
    "| `line.ngroups`| int | 10 | Maximum number of groups to display in the `Line Chart` |\n",
    "| `line.sort_descending`| bool | True | Whether to sort the groups in descending order of frequency in the `Line Chart` |\n",
    "| `line.yscale`| str | \"linear\" | Y-axis scale (\"linear\" or \"log\") for the `Line Chart` |\n",
    "| `line.unit`| str | \"auto\" | Defines the time unit to group values over for a datetime column. It can be \"year\", \"quarter\", \"month\", \"week\", \"day\", \"hour\", \"minute\", \"second\". With default value \"auto\", it will use the time unit such that the resulting number of groups is closest to 15 in the `Line Chart` |\n",
    "| `line.agg`| str | \"mean\" | Specify the aggregate to use when aggregating over a numeric column in the `Line Chart` |\n",
    "| `scatter.enable` | bool | True | Whether enable `Scatter Plot` |\n",
    "| `scatter.height` | int | None | The height of the `Scatter Plot`|\n",
    "| `scatter.width` | int | None | The width of the `Scatter Plot` |\n",
    "| `scatter.sample_size`| int | 1000 | Number of points to randomly sample per partition in the `Scatter Plot` |\n",
    "| `scatter.sample_rate`| float | None | Defines the sample rate per partition in the `Scatter Plot`. Cannot be used with `scatter.sample_size`. Set it to 1.0 for no sampling |\n",
    "| `hexbin.enable` | bool | True | Whether enable `Hexbin Plot` |\n",
    "| `hexbin.height` | int | None | The height of the `Hexbin Plot`|\n",
    "| `hexbin.width` | int | None | The width of the `Hexbin Plot` |\n",
    "| `hexbin.tile_size` | float | \"auto\" | The size of the tile in the hexbin plot. Measured from the middle of a hexagon to its left or right corner in the `Hexbin Plot`.|\n",
    "| `nested.enable` | bool | True | Whether enable `Nested Bar Chart` |\n",
    "| `nested.height` | int | None | The height of the `Nested Bar Chart`|\n",
    "| `nested.width` | int | None | The width of the `Nested Bar Chart` |\n",
    "| `nested.ngroups`| int | 10 | Maximum number of most frequent values from the first column to display in the `Nested Bar Chart` |\n",
    "| `nested.nsubgroups`| int | 5 | Maximum number of most frequent values from the second column to display (computed on the filtered data consisting of the most frequent values from the first column) in the `Nested Bar Chart` |\n",
    "| `stacked.enable` | bool | True | Whether enable `Stacked Bar Chart` |\n",
    "| `stacked.height` | int | None | The height of the `Stacked Bar Chart`|\n",
    "| `stacked.width` | int | None | The width of the `Stacked Bar Chart` |\n",
    "| `stacked.ngroups`| int | 10 | Maximum number of most frequent values from the first column to display in the `Stacked Bar Chart` |\n",
    "| `stacked.nsubgroups`| int | 5 | Maximum number of most frequent values from the second column to display (computed on the filtered data consisting of the most frequent values from the first column) in the `Stacked Bar Chart` |\n",
    "| `stacked.unit`| str | \"auto\" |         Defines the time unit to group values over for a datetime column. It can be \"year\", \"quarter\", \"month\", \"week\", \"day\", \"hour\", \"minute\", \"second\". With default value \"auto\", it will use the time unit such that the resulting number of groups is closest to 15 in the `Stacked Bar Chart` |\n",
    "| `stacked.sort_descending`| bool | True | Whether to sort the groups in descending order of frequency in the `Stacked Bar Chart` |\n",
    "| `heatmap.enable` | bool | True | Whether enable `Heat Map` |\n",
    "| `heatmap.height` | int | None | The height of the `Heat Map`|\n",
    "| `heatmap.width` | int | None | The width of the `Heat Map` |\n",
    "| `heatmap.ngroups`| int | 10 | Maximum number of most frequent values from the first column to display in the `Heat Map` |\n",
    "| `heatmap.nsubgroups`| int | 5 | Maximum number of most frequent values from the second column to display (computed on the filtered data consisting of the most frequent values from the first column)in the `Heat Map` |\n"
   ]
  },
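  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As one possible use of the `plot()` parameters above, the following fragment switches off two plots through their `enable` flags and adjusts a few display settings. The keys are taken from the table; the values are arbitrary examples, not recommendations.\n",
    "\n",
    "```python\n",
    "# Illustrative values only.\n",
    "config = {\n",
    "    \"pie.enable\": False,           # skip the Pie Chart\n",
    "    \"wordcloud.enable\": False,     # skip the Word Cloud\n",
    "    \"hist.yscale\": \"log\",          # \"linear\" or \"log\"\n",
    "    \"bar.bars\": 15,                # show at most 15 bars\n",
    "    \"bar.sort_descending\": False,  # do not sort bars in descending order\n",
    "}\n",
    "```"
   ]
  },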
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### `plot_missing()`\n",
    "\n",
    "| Local Parameter | Type |Default | Description |\n",
    "| --- | --- | --- | --- |\n",
    "| `spectrum.bins` | int | 20 | Maximum number of bins to display in the `Spectrum` |\n",
    "|`PDF.sample_size` | int | 100 | Number of evenly spaced samples between the minimum and maximum values to compute the `PDF` at |\n",
    "|`CDF.sample_size` | int | 100 | Number of evenly spaced samples between the minimum and maximum values to compute the `CDF` at |\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### `plot_correlation()`\n",
    "\n",
    "| Local Parameter | Type |Default | Description |\n",
    "| --- | --- | --- | --- |\n",
    "| `scatter.sample_size`| int | 1000 | Number of points to randomly sample per partition in the `Scatter Plot` in `plot_correlation(df, x, y)`|\n",
    "| `scatter.sample_rate`| float | None | Defines the sample rate per partition in the `Scatter Plot`. Cannot be used with `scatter.sample_size`. Set it to 1.0 for no sampling |"
   ]
  },
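  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Because `scatter.sample_size` and `scatter.sample_rate` cannot be used together, a config would set one or the other. The fragment below sketches both options with arbitrary values; the dictionary names are hypothetical.\n",
    "\n",
    "```python\n",
    "# Option 1: sample a fixed number of points per partition.\n",
    "config_by_size = {\"scatter.sample_size\": 500}\n",
    "\n",
    "# Option 2: sample a fraction of the points per partition instead.\n",
    "# A rate of 1.0 means no sampling. Do not combine with `scatter.sample_size`.\n",
    "config_by_rate = {\"scatter.sample_rate\": 0.1}\n",
    "```"
   ]
  },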
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### `create_report()`\n",
    "\n",
    "| Local Parameter | Type |Default | Description |\n",
    "| --- | --- | --- | --- |\n",
    "| `overview.enable`| bool | True | Whether enable the `Overview` section in the report |\n",
    "| `variables.enable`| bool | True | Whether enable the `Variables` section in the report|\n",
    "| `interactions.enable`| bool | True | Whether enable the `Interactions` section in the report |\n",
    "| `interactions.cat_enable`| bool | False | Whether enable the categorical columns in the interaction section|\n",
    "| `correlations.enable`| bool | True | Whether enable the `Correlations` section in the report |\n",
    "| `missingvalues.enable`| bool | True | Whether enable the `Missing Values` section in the report |\n",
    "| `bar.enable` | bool | True | Whether enable `Bar Chart` |\n",
    "| `bar.bars` | int | 10 | Maximum number of bars to display in the `Bar Chart` |\n",
    "| `bar.sort_descending` | bool | True|  Whether to sort the bars in descending order in the `Bar Chart`|\n",
    "| `bar.yscale` | str | \"linear\" |  Y-axis scale (\"linear\" or \"log\") for the `Bar Chart` |\n",
    "| `pie.enable` | bool | True | Whether enable `Pie Chart` |\n",
    "| `pie.slices`| int | 10 | Maximum number of pie slices to display in the `Pie Chart`|\n",
    "| `pie.sort_descending`| bool | True | Whether to sort the slices in descending order of frequency in the `Pie Chart`|\n",
    "| `wordcloud.enable` | bool | is_notebook | Whether enable `Word Cloud`. Default True if it's in notebook environment, else False |\n",
    "| `wordcloud.top_words`| int | 30 | Maximum number of most frequent words to display in the `Word Cloud`|\n",
    "| `wordcloud.stopword`| bool | True | Whether to remove stopwords in the `Word Cloud`|\n",
    "| `wordcloud.lemmatize`| bool | False |  Whether to lemmatize the words in the `Word Cloud`|\n",
    "| `wordcloud.stem`| bool | False |  Whether to apply Potter Stem on the words in the `Word Cloud`|\n",
    "| `wordfreq.top_words`| int | 30 | Maximum number of most frequent words to display in the `Word Frequency`|\n",
    "| `wordcloud.stopword`| bool | True | Whether to remove stopwords in the `Word Frequency`|\n",
    "| `wordcloud.lemmatize`| bool | False |  Whether to lemmatize the words in the `Word Frequency`|\n",
    "| `wordcloud.stem`| bool | False |  Whether to apply Potter Stem on the words in the `Word Frequency`|\n",
    "| `wordlen.bins`| int | 50 | Maximum number of bins in the `Word Length`|\n",
    "| `wordlen.yscale`| str | \"linear\" | Y-axis scale (\"linear\" or \"log\") for the `Word Length`|\n",
    "| `line.enable` | bool | True | Whether enable `Line Chart` |\n",
    "| `line.unit`| str | \"auto\" | Defines the time unit to group values over for a datetime column. It can be \"year\", \"quarter\", \"month\", \"week\", \"day\", \"hour\", \"minute\", \"second\". With default value \"auto\", it will use the time unit such that the resulting number of groups is closest to 15 in the `Line Chart` |\n",
    "| `kde.enable` | bool | True | Whether enable `KDE Plot` |\n",
    "|`kde.bins` | int | 50 | Maximum number of bins in the `KDE Plot` |\n",
    "| `kde.yscale`| str | \"linear\" | Y-axis scale (\"linear\" or \"log\") for the `KDE Plot` |\n",
    "| `box.enable` | bool | True | Whether enable `Box Plot` |\n",
    "| `box.sort_descending`| bool | True | Whether to sort the boxes in descending order of frequency in the `Box Plot` |\n",
    "| `spectrum.enable` | bool | True | Whether enable `Spectrum`|\n",
    "| `spectrum.bins` | int | 20 | Maximum number of bins to display in the `Spectrum` |"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.1"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
